ABSTRACT

Cloud computing is gaining recognition as a public utility that lets clients focus on their work rather than on
installing and maintaining supporting infrastructure, which is installed and maintained by the cloud service
providers. Cloud computing is meant to be scalable, to enhance quality of service (QoS), to be cost effective and
to offer a simplified user interface so that the customer can appreciate the idea behind it. In resource
allocation, a client's request has to be executed through various stages, and a queue of requests may be waiting
to be served at each stage. Queuing methods are therefore required to handle this situation. In this paper, we
focus on a mathematical formulation using queuing theory to show how the throughput and time delay of a system
may vary between a single server system and a multiple server system in a cloud computing environment.
I. INTRODUCTION

The management of resources requires limiting access to the pool of shared resources. Whatever kind of
resource is involved, resource management also tracks the current state of resource consumption.
Resources in Information and Communications Technologies (ICT) are fundamental elements such as the hardware
of computer systems, data communications and computer networks, operating systems and software
applications. Since these resources are limited in number, it is important to restrict access to some of them.
In this way, a Service Level Agreement (SLA) can be ensured between the customers who request resources and
the providers who own the systems. The main resource sharing function of a distributed computer system is
to assign user requests to the resources in the system such that response time, resource utilization and network
throughput are optimized.

Over the decades, the ICT systems used in the development of the Internet and distributed systems have given
computer users an opportunity to access and exploit the different resources in those systems. Recently, a new
term has emerged in the area of computing, namely cloud computing, which is a technological adaptation of
distributed computing and the Internet. The main idea behind cloud computing is to give customers efficient
access to computing resources through web services. Cloud based network services are provided by virtual
hardware, which does not physically exist and thus scales up and down according to the incoming user requests.
Cloud computing provides different types of services, such as software as a service (SaaS), infrastructure as a
service (IaaS) and platform as a service (PaaS). However, it presents a number of management challenges,
because customers of these cloud services have to integrate with the architecture defined by the cloud
provider, using its specific parameters for working with cloud components. As the number of clients in the cloud
ecosystem increases, it is important to find an efficient way to handle the clients' demand by maximizing the
throughput and minimizing the response time for a given system.

An increase in demand for computing resources in a system that uses a single server can overload the
system [1]. The main benefit of having multiple servers in a system is to increase its performance by reducing
overload, so that the system can handle requests and allocate resources to those requests effectively. If a
single server is used, services are provided by means of batch processing, while in a system with multiple
servers services are provided by either a parallel or a sequential system [2]. In this paper we show the
variation in throughput and time delay between a single server and multiple servers.
The rest of this paper is organized as follows: In Section 2, we introduce some previous studies about
resource sharing and cloud computing resource management. In Section 3, we give background information
about cloud computing resource sharing and emphasize why we focus on the throughput and delay of single and
multiple server models. We explain the mathematics of queuing theory in Section 4. In Section 5, we provide
simulation results and analysis. Concluding remarks are given in Section 6.



II. RELATED WORK 

Efforts have been made to find efficient ways for cloud users to access cloud resources quickly.
Reference [1] considers two types of systems, the single server system M/M/1 and the multiple server system
M/M/n, and analyses the waiting time of a client in the queue of each system in order to identify the more
efficient one. Satyanarayana et al. focused on a model for allocating resources to a job arriving in the cloud
using a queuing model, in which performance measures such as the mean number of requests, throughput,
utilization and mean delay in the system are analysed [3]. The authors in [4] made another effort in the
performance evaluation of cloud data centers. All these works share the goal of improving efficiency in the
cloud environment. Pawar and Wagh improved resource utilisation through multiple SLA parameters (memory,
network bandwidth and CPU time) and resource allocation through a pre-emptive mechanism for high priority task
execution [5]. Their work was based on another study by Lugun [6], which analyses job scheduling with the
different QoS parameters required by each user, builds a non-pre-emptive priority model for the jobs and also
considers how service providers can gain more profit from the offered resources. Bheda and Lakhani
presented a dynamic provisioning technique that adapts itself to different workload changes and offers a 
guaranteed QoS. They modelled the behaviour and performance of applications and Cloud-based IT resources to 
adaptively serve user requests by using the queuing network system model and workload information for the 
physical infrastructure [7]. 

Reference [8] puts an emphasis on running multiple virtual machines (VMs) with multiple operating
systems and applications in order to address the resource allocation issues involved in guaranteeing QoS in
virtualized environments such as Cloud and Grid networks. The authors focused on disk resource allocation rather
than CPU, memory and network allocation. Cloud computing, as a pool of virtualized computer resources
spanning the network, is provided as a service over the Internet. User requests for web services arrive at
these virtual servers, and software based load balancing approaches are optimized for QoS. The authors in [9]
propose a stochastic hill climbing algorithm for the allocation of incoming jobs to the servers or virtual
machines. They compared the results with First Come First Serve (FCFS) algorithms.



III. CLOUD COMPUTING RESOURCE SHARING 

Cloud computing falls within parallel and distributed computing: it is a collection of interconnected
computers virtualized as one computing resource, and the client gets access to the resources under an
agreement between the provider and the client, known as the SLA. As mentioned earlier, cloud computing
offers software, platform and infrastructure as services, as shown in Fig. 1. Software as a service
provides software such as mail (e.g. Gmail, Yahoo Mail), social network sites, Google Drive,
and so on, to the customers or clients. Infrastructure as a service provides VMs, storage, networks, load
balancers and so on, and platform as a service provides databases such as SQL and Oracle, web services,
runtimes (e.g. Java) and so on. The clients access these services through various devices, as shown in
Fig. 1 [10] [11].
Fig. 1 Service types in Cloud Computing [http://en.wikipedia.org/wiki/Cloud_computing]
As stated earlier [1], having multiple servers in a cloud computing environment increases
the performance of the system by reducing the mean queue length and the waiting time
(delay) compared to a system with a single server. In this paper we follow some of the
previous papers, but we focus on the throughput and delay of the two models employed in our work
to see which of the two handles requests for resource sharing more efficiently. The
utilization (occupancy) rate further indicates how heavily loaded a system is when dealing with
requests in both models, using queuing theory.

IV. QUEUING MODEL/KENDALL NOTATIONS 

Queuing theory is the study of waiting lines; it enables mathematical analysis of the related processes,
which include arrival, waiting in the queue and being served by the server [12]. To describe queuing systems,
D. G. Kendall suggested a notation that gives a standard way to describe and classify them.
A typical Kendall notation is given as A/S/C, where:

• A = arrival process (distribution of request inter-arrival times)

• S = service time distribution

• C = number of servers

Three further notations represent the number of buffers (available places in the system) as K, the
calling population size as N and the service discipline as SD; in [13] these are taken to be an infinite
queue, an infinite population and the FCFS (First-Come-First-Served) discipline. In our work, the arrival and
service times (A, S) follow a Markovian process (M), whereby arrivals follow a Poisson process with
exponentially distributed inter-arrival times and service times are exponentially distributed. The two
notations used in our work are M/M/1 and M/M/c.

A typical queuing system consists of the input, i.e. the requests seeking to be processed; arrivals, i.e.
the units that enter the system seeking resources; the queue, which holds the requests waiting for resources;
the service facilities that serve the requests; and departures, i.e. the requests that have completed service
and leave the system [13].

The queuing discipline is an important characteristic of a queuing system: it determines how requests are
selected for service once a queue has formed. The discipline can be FCFS, random service selection (RSS) or a
priority system, where some requests are given higher priority. The number of service channels is another
important characteristic: the system can have either a single server or multiple servers [12].
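
The difference between the FCFS and priority disciplines can be illustrated with a minimal Python sketch. The request names, the `(name, priority)` tuple layout and the convention that a lower number means higher priority are assumptions made for this illustration only; they do not come from the models analysed in this paper.

```python
import heapq
from collections import deque


def fcfs_order(requests):
    """Serve requests strictly in arrival order (FCFS)."""
    queue = deque(requests)
    served = []
    while queue:
        name, _priority = queue.popleft()
        served.append(name)
    return served


def priority_order(requests):
    """Serve the lowest priority number first; ties keep arrival order."""
    # The arrival index breaks ties, so equal-priority requests stay FCFS.
    heap = [(priority, seq, name) for seq, (name, priority) in enumerate(requests)]
    heapq.heapify(heap)
    served = []
    while heap:
        _priority, _seq, name = heapq.heappop(heap)
        served.append(name)
    return served


if __name__ == "__main__":
    waiting = [("a", 2), ("b", 1), ("c", 2), ("d", 0)]
    print(fcfs_order(waiting))      # arrival order
    print(priority_order(waiting))  # highest priority (lowest number) first
```

Both functions serve every request exactly once; only the order of service differs, which is what the queuing discipline controls.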

Since a cloud computing environment can have either a single server system or a multiple server
system, queuing theory can be used to show mathematically how the two systems compare in terms of
efficiency. The notation used in our analysis is as follows:

$P_0$ = probability that the system is empty

$R_s$ = expected number of requests in the service facility

$R_q$ = expected number of requests in the queue

$R$ = expected number of requests in the system

$T_s$ = expected time in the service facility

$T_q$ = expected time (delay) in the queue

$\lambda$ = arrival rate of requests

$\mu$ = service rate (number of request completions per unit time)

$Th$ = throughput

The queue is said to be stable if the service rate $\mu$ is greater than the mean arrival rate $\lambda$, i.e.
$\mu > \lambda$, so that the queue does not keep growing forever and a busy system eventually reaches a state
where it is idle [13].

A. Single Server System (M/M/1)

The M/M/1 system shown in Fig. 2 is a single server queue characterized by:

- Exponentially distributed inter-arrival times.

- Exponentially distributed service times.

- Infinite population of potential requests.
Fig. 2 M/M/1 Service types in Cloud Computing

Let n be the number of units (requests) in the system (n > 0). For a system with one server and an infinite
population of requests, $\rho < 1$ is needed to ensure that the system is in equilibrium. Using the reduced
equations and the notation in (1), the traffic intensity for a single server system is given by

$\rho = \lambda/\mu$    (1)

The probability that the system is empty and the probability of n units in the system are

$P_0 = 1 - \rho$    (2)

$P_n = \rho^n (1 - \rho)$, $n > 0$    (3)

The expected number of requests in the service facility is given by

$R_s = \rho$    (4)

The expected number of requests in the queue is

$R_q = \rho^2/(1 - \rho)$    (5)

The expected number of requests in the system is the sum of (4) and (5), which gives

$R = \rho + \rho^2/(1 - \rho) = \rho/(1 - \rho)$    (6)

The expected time in service ($T_s$) is obtained by dividing (4) by $\lambda$:

$T_s = 1/\mu$    (7)

The expected time (delay) in the queue ($T_q$) for a single server system is given by

$T_q = R_q/\lambda = \rho/[\mu(1 - \rho)]$    (8)

The throughput for M/M/1 systems is given by

$Th = \lambda$    (9)
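
As a minimal sketch, the M/M/1 measures in (1) to (9) can be evaluated directly in Python. The function and field names below are illustrative choices, not part of the paper, and `lam` and `mu` stand for the arrival rate and service rate.

```python
from dataclasses import dataclass


@dataclass
class MM1Metrics:
    rho: float  # traffic intensity, eq. (1)
    p0: float   # probability the system is empty, eq. (2)
    rs: float   # expected requests in service, eq. (4)
    rq: float   # expected requests in the queue, eq. (5)
    r: float    # expected requests in the system, eq. (6)
    ts: float   # expected time in service, eq. (7)
    tq: float   # expected delay in the queue, eq. (8)
    th: float   # throughput, eq. (9)


def mm1_metrics(lam: float, mu: float) -> MM1Metrics:
    """Evaluate the steady-state M/M/1 formulas for arrival rate lam and service rate mu."""
    if lam <= 0 or mu <= 0:
        raise ValueError("lam and mu must be positive")
    if lam >= mu:
        raise ValueError("unstable queue: the formulas require lam < mu")
    rho = lam / mu
    rq = rho ** 2 / (1 - rho)
    return MM1Metrics(
        rho=rho, p0=1 - rho, rs=rho, rq=rq, r=rho + rq,
        ts=1 / mu, tq=rq / lam, th=lam,
    )


def mm1_prob_n(lam: float, mu: float, n: int) -> float:
    """Probability of n requests in the system, eq. (3)."""
    rho = mm1_metrics(lam, mu).rho
    return rho ** n * (1 - rho)


if __name__ == "__main__":
    print(mm1_metrics(15, 25))
```

For an arrival rate of 15 and a service rate of 25, this gives a traffic intensity of 0.6 and a queue delay of 0.06, since 0.6 / (25 × 0.4) = 0.06.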



B. Multiple Server System (M/M/c)

In the multiple server system shown in Fig. 3, requests join a single queue and each request is
served by whichever server in the system is idle. The servers are identical and any request can be served by
any server.
Fig. 3 M/M/c Service types in Cloud Computing
[http://math.stackexchange.com] 

For a system with k servers and an infinite population of requests, the probabilities are given as follows.

Utilization of a single server is given by

$\rho = \lambda/(k\mu)$    (10)

Utilization of the system is given by $r = \lambda/\mu$.

$P_0$, the probability that there is no request in the system, is

$P_0 = 1 / \{ \sum_{n=0}^{k-1} r^n/n! + r^k/[(k-1)!\,(k - r)] \}$    (11)

where n is the number of requests in the system (n > 0) and k is the number of service facilities (servers).
The probability of n requests in the system is given by

$P_n = (r^n/n!)\,P_0$, $0 \le n \le k - 1$

$P_n = [r^n/(k!\,k^{n-k})]\,P_0$, $n \ge k$    (12)

Expected number of requests in the queue ($R_q$):

$R_q = P_0\, r^k \rho / [k!\,(1 - \rho)^2]$    (13)

The expected time (delay) in the queue for the multiple server system is given by

$T_q = R_q/\lambda = P_0\, r^k / [k!\,(k\mu)(1 - \rho)^2]$    (14)
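
The M/M/c measures can be sketched in Python in the same way. The code below assumes the standard M/M/c (Erlang C) forms of (10) to (14), with per-server utilization `rho = lam / (k * mu)` and offered load `r = lam / mu`; the function names and the dictionary keys are illustrative and not taken from the paper.

```python
from math import factorial


def mmc_metrics(lam: float, mu: float, k: int) -> dict:
    """Steady-state M/M/c measures for arrival rate lam, service rate mu, k servers."""
    if lam <= 0 or mu <= 0 or k < 1:
        raise ValueError("lam and mu must be positive and k must be at least 1")
    r = lam / mu    # utilization of the system (offered load)
    rho = r / k     # utilization of a single server, eq. (10)
    if rho >= 1:
        raise ValueError("unstable queue: the formulas require lam < k * mu")
    head = sum(r ** n / factorial(n) for n in range(k))
    tail = r ** k / (factorial(k - 1) * (k - r))
    p0 = 1.0 / (head + tail)                                   # eq. (11)
    rq = p0 * r ** k * rho / (factorial(k) * (1 - rho) ** 2)   # eq. (13)
    tq = rq / lam                                              # eq. (14)
    return {"r": r, "rho": rho, "p0": p0, "rq": rq, "tq": tq}


def mmc_prob_n(lam: float, mu: float, k: int, n: int) -> float:
    """Probability of n requests in the system, eq. (12)."""
    m = mmc_metrics(lam, mu, k)
    r, p0 = m["r"], m["p0"]
    if n < k:
        return r ** n / factorial(n) * p0
    return r ** n / (factorial(k) * k ** (n - k)) * p0


if __name__ == "__main__":
    for servers in (1, 2, 5):
        print(servers, mmc_metrics(15, 25, servers)["tq"])
```

With `k = 1` the expressions reduce to the single server results, which gives a simple consistency check: for an arrival rate of 15 and a service rate of 25 the queue delay is again 0.06, and it decreases as servers are added.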

To obtain the throughput of the system, we first find the throughput of completed service in a given time,
which is obtained as

$Th = k\rho\mu$

Taking $\rho = \lambda/\mu$ as in (1), this becomes

$Th = k\lambda$    (15)

It may seem obvious that a multiple server system will perform better; however, it is important to put
this into analysis, as it helps researchers visualize the differences when employing such systems.
Even among multiple server systems, those with more servers will perform better than those
with fewer servers. Virtualization of these servers will make the systems more efficient when allocating
resources to different user requests.



V. SIMULATION AND ANALYSIS OF RESULTS 

In our simulation we let n = 10 and, for M/M/c, the number of service facilities in the system k = 5.
Table I shows the results of our simulations for both the single server system and the multiple server system.
Our aim is to compare the throughput of each system to give a clear view of how a multiple server system is
more efficient and time saving than a single server system. The throughput is the number of completed
requests per unit time, while the delay is the time a request spends in the queue until it is executed. The
values used in our simulation were picked randomly so as to conform with the pattern in which requests for
resources enter a given system.
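
The simulation procedure itself is not detailed here. One possible way to estimate the M/M/1 queue delay by simulation is sketched below; it is an assumption made for illustration, not the authors' method. It uses Lindley's recursion, in which the wait of the next request equals the current wait plus the current service time minus the next inter-arrival gap, floored at zero. The function name, the seed and the number of simulated requests are arbitrary choices.

```python
import random


def simulate_mm1_wait(lam: float, mu: float, n_requests: int = 200_000, seed: int = 1) -> float:
    """Estimate the mean M/M/1 queue delay by simulating n_requests arrivals."""
    if not 0 < lam < mu:
        raise ValueError("need 0 < lam < mu for a stable queue")
    rng = random.Random(seed)
    wait = 0.0    # queue delay of the current request (the first one finds an empty system)
    total = 0.0
    for _ in range(n_requests):
        total += wait
        service = rng.expovariate(mu)   # exponentially distributed service time
        gap = rng.expovariate(lam)      # exponentially distributed inter-arrival time
        # Lindley's recursion: delay seen by the next request
        wait = max(0.0, wait + service - gap)
    return total / n_requests


if __name__ == "__main__":
    lam, mu = 15, 25
    rho = lam / mu
    print("simulated:", simulate_mm1_wait(lam, mu))
    print("formula  :", rho / (mu * (1 - rho)))
```

For long runs the simulated mean should approach the closed-form M/M/1 queue delay, which is 0.06 for an arrival rate of 15 and a service rate of 25.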

TABLE I

ARRIVAL RATE, NUMBER OF REQUESTS, THROUGHPUT, DELAY

$\lambda$   $\mu$    Throughput (Th)       Delay ($T_q$)
                     M/M/1     M/M/c       M/M/1       M/M/c
15          25       15        75          0.0600      2.68e-05
30          50       30        150         0.0300      1.34e-05
45          78       45        225         0.0161      6.67e-06
60          95       60        325         0.02280     1.12e-05
85          110      85        425         0.0309      1.47e-05
Fig. 4 Delay against the arrival time for M/M/1 system

Fig. 5 Delay against the arrival rate for M/M/c system

From the graphs (Fig. 4 and Fig. 5), we can see that the delay in the M/M/c system drops to values far
below those of M/M/1, while the M/M/1 curve keeps increasing. The M/M/c curve starts to increase after its
initial drop; however, the rate at which it increases is very slow.

Fig. 6 Graph of throughput against the arrival rate for M/M/1 system

Fig. 7 Graph of Throughput against the arrival rate for M/M/c



Fig. 6 and Fig. 7 show the throughput of the respective models. The difference in the number of requests
that can be served by each model is clearly visible, and from these figures we conclude that
M/M/c is more efficient for resource allocation.

TABLE II

ARRIVAL RATE, NUMBER OF REQUESTS, UTILIZATION RATE (%)

$\lambda$   $\mu$    M/M/1 ($\rho$ %)    M/M/c ($\rho$ %)
15          25       60                  6
30          50       60
45          78       57.69231            5.769231
60          95       63.15789            6.315789
85          110      77.27273            7.727273
Fig. 8 Utilization rate (%) against the arrival rate for M/M/1 system



Fig. 9 Utilization rate (%) against the arrival rate for M/M/c system

Fig. 8 and Fig. 9 show the level of utilization (based on the values in Table II) of each system when
handling the same requests, and indicate that the M/M/c system (Fig. 9) is less occupied than the M/M/1 system (Fig. 8).
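
As a small check, the M/M/1 utilization column of Table II can be recomputed from the first two columns, assuming they hold the arrival rate and the service rate so that the utilization is their ratio as in (1). The `servers` parameter is a hypothetical extension that divides the load over several servers; this sketch checks only the M/M/1 column.

```python
# (arrival rate, service rate) pairs, read from the first two columns of Table II
ROWS = [(15, 25), (30, 50), (45, 78), (60, 95), (85, 110)]


def utilization_percent(lam: float, mu: float, servers: int = 1) -> float:
    """Utilization in percent: 100 * lam / (servers * mu)."""
    if lam < 0 or mu <= 0 or servers < 1:
        raise ValueError("need lam >= 0, mu > 0 and servers >= 1")
    return 100.0 * lam / (servers * mu)


if __name__ == "__main__":
    for lam, mu in ROWS:
        print(lam, mu, round(utilization_percent(lam, mu), 5))
```

Rounded to five decimal places, the single server values are 60, 60, 57.69231, 63.15789 and 77.27273, matching the M/M/1 column.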



VI. CONCLUSION 

In conclusion, the throughput of a system with multiple serving units is higher than that of a
system with a single serving unit. The waiting time before a request is executed is also greater on a single
server than in a multiple server system. The utilization rate (occupancy) of the multiple server system is
much lower than that of the single server system. Therefore, in our opinion, an efficient and reliable system
for handling requests for resources in a cloud computing environment needs multiple servers; even in a
multiple server system, virtualization of each server will further increase the efficiency in handling the
system's activities.